Smart Paper Notes

Robust Distributed Optimization With Randomly Corrupted Gradients

Berkay Turan , Cesar A. Uribe , Hoi-To Wai , Mahnoosh Alizadeh

Categories: Machine Learning | Machine Learning (Statistics)

2021-06-28

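In this paper, we propose a first-order distributed optimization algorithm that is provably robust to Byzantine failures (arbitrary and potentially adversarial behavior), where all the participating agents are prone to failure. We model each agent's state over time as a two-state Markov chain that indicates Byzantine or trustworthy behavior at different time instants. We set no restrictions on the maximum number of Byzantine agents at any given time. We design our method based on three layers of defense: 1) temporal robust aggregation, 2) spatial robust aggregation, and 3) gradient normalization. We study two settings for stochastic optimization, namely Sample Average Approximation and Stochastic Approximation. We provide convergence guarantees of our method for strongly convex and smooth non-convex cost functions.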
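
The three layers of defense named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the aggregators, so the coordinate-wise median used for both the temporal and the spatial step is an assumption made purely for illustration, and the function names (`temporal_aggregate`, `spatial_aggregate`, `normalize`, `robust_step`) are hypothetical rather than taken from the paper.

```python
import math
from statistics import median


def temporal_aggregate(history):
    """Layer 1 (assumed form): coordinate-wise median over one agent's recent gradients."""
    return [median(coords) for coords in zip(*history)]


def spatial_aggregate(vectors):
    """Layer 2 (assumed form): coordinate-wise median across agents."""
    return [median(coords) for coords in zip(*vectors)]


def normalize(vec, eps=1e-12):
    """Layer 3: rescale to unit Euclidean norm so a corrupted magnitude cannot dominate."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]


def robust_step(x, histories, step_size):
    """One descent step using the three layers in sequence.

    histories[i] is the list of recent gradient vectors reported by agent i.
    """
    per_agent = [temporal_aggregate(h) for h in histories]
    direction = normalize(spatial_aggregate(per_agent))
    return [xi - step_size * di for xi, di in zip(x, direction)]
```

In this sketch, an agent that reports one corrupted gradient inside its window is repaired by the temporal median, while an agent that is corrupted throughout the window is outvoted by the spatial median, provided most agents are trustworthy at that step. That majority condition is a property of the median stand-in, not a claim about the paper's guarantees.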

translated by Google Translate

Parameter and Feature Selection in Stochastic Linear Bandits

Ahmadreza Moradipari , Berkay Turan , Yasin Abbasi-Yadkori , Mahnoosh Alizadeh , Mohammad Ghavamzadeh

Category: Machine Learning

2021-06-09

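We study two model selection settings in stochastic linear bandits (LB). In the first setting, which we refer to as feature selection, the expected reward of the LB problem is in the linear span of at least one of $M$ feature maps (models). In the second setting, the reward parameter of the LB problem is arbitrarily selected from $M$ models represented as (possibly) overlapping balls in $\mathbb{R}^d$. However, the agent only has access to misspecified models, i.e., estimates of the centers and radii of the balls. We refer to this setting as parameter selection. For each setting, we develop and analyze an algorithm that is based on a reduction from bandits to full-information problems. This allows us to obtain regret bounds that are not worse (up to a $\sqrt{\log M}$ factor) than the case where the true model is known. The regret of our parameter selection algorithm also scales logarithmically with model misspecification. Finally, we empirically show the effectiveness of our algorithms using synthetic and real-world experiments.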

Guiding continuous operator learning through Physics-based boundary constraints

Nadim Saad , Gaurav Gupta , Shima Alizadeh , Danielle C. Maddix

Category: Machine Learning

2022-12-14

Boundary conditions (BCs) are important groups of physics-enforced constraints that are necessary for solutions of Partial Differential Equations (PDEs) to satisfy at specific spatial locations. These constraints carry important physical meaning, and guarantee the existence and the uniqueness of the PDE solution. Current neural-network based approaches that aim to solve PDEs rely only on training data to help the model learn BCs implicitly. There is no guarantee of BC satisfaction by these models during evaluation. In this work, we propose Boundary enforcing Operator Network (BOON) that enables the BC satisfaction of neural operators by making structural changes to the operator kernel. We provide our refinement procedure, and demonstrate the satisfaction of physics-based BCs, e.g. Dirichlet, Neumann, and periodic by the solutions obtained by BOON. Numerical experiments based on multiple PDEs with a wide variety of applications indicate that the proposed approach ensures satisfaction of BCs, and leads to more accurate solutions over the entire domain. The proposed correction method exhibits a (2X-20X) improvement over a given operator model in relative $L^2$ error (0.000084 relative $L^2$ error for Burgers' equation).

FactorJoin: A New Cardinality Estimation Framework for Join Queries

Ziniu Wu , Parimarjan Negi , Mohammad Alizadeh , Tim Kraska , Samuel Madden

Category: Machine Learning

2022-12-11

Cardinality estimation is one of the most fundamental and challenging problems in query optimization. Neither classical nor learning-based methods yield satisfactory performance when estimating the cardinality of the join queries. They either rely on simplified assumptions leading to ineffective cardinality estimates or build large models to understand the data distributions, leading to long planning times and a lack of generalizability across queries. In this paper, we propose a new framework FactorJoin for estimating join queries. FactorJoin combines the idea behind the classical join-histogram method to efficiently handle joins with the learning-based methods to accurately capture attribute correlation. Specifically, FactorJoin scans every table in a DB and builds single-table conditional distributions during an offline preparation phase. When a join query comes, FactorJoin translates it into a factor graph model over the learned distributions to effectively and efficiently estimate its cardinality. Unlike existing learning-based methods, FactorJoin does not need to de-normalize joins upfront or require executed query workloads to train the model. Since it only relies on single-table statistics, FactorJoin has small space overhead and is extremely easy to train and maintain. In our evaluation, FactorJoin can produce more effective estimates than the previous state-of-the-art learning-based methods, with 40x less estimation latency, 100x smaller model size, and 100x faster training speed at comparable or better accuracy. In addition, FactorJoin can estimate 10,000 sub-plan queries within one second to optimize the query plan, which is very close to the traditional cardinality estimators in commercial DBMS.
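
The classical join-histogram idea that the abstract builds on can be sketched as follows. This is the textbook estimator under a per-bin uniformity assumption, not FactorJoin's factor-graph inference; the function names and the fixed-width integer binning are assumptions chosen for illustration.

```python
from collections import Counter


def build_histogram(keys, bin_width):
    """Per-bin row count and distinct-value count for an integer join-key column."""
    rows = Counter(k // bin_width for k in keys)
    distinct = Counter(k // bin_width for k in set(keys))
    return rows, distinct


def estimate_join_size(hist_a, hist_b):
    """Estimate |A JOIN B| on the key, assuming values are uniform inside each bin.

    For every bin present in both tables, the estimate is
    rows_a * rows_b / max(distinct_a, distinct_b).
    """
    rows_a, distinct_a = hist_a
    rows_b, distinct_b = hist_b
    total = 0.0
    for b in rows_a.keys() & rows_b.keys():
        total += rows_a[b] * rows_b[b] / max(distinct_a[b], distinct_b[b])
    return total
```

With one bin per key value the estimate is exact; wider bins trade accuracy for space, which is the kind of error that learned per-table distributions aim to reduce.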

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Gemino: Practical and Robust Neural Compression for Video Conferencing

Vibhaalakshmi Sivaraman , Pantea Karimi , Vedantha Venkatapathy , Mehrdad Khani , Sadjad Fouladi , Mohammad Alizadeh , Frédo Durand , Vivienne Sze

Category: Computer Vision

2022-09-21

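Video conferencing systems suffer from poor user experience when network conditions deteriorate, because current video codecs simply cannot operate at extremely low bitrates. Recently, several neural alternatives have been proposed that reconstruct talking-head videos at very low bitrates using sparse representations of each frame, such as facial landmark information. However, these approaches produce poor reconstructions in scenarios with major movement or occlusions over the course of a call, and do not scale to higher resolutions. We design Gemino, a new neural compression system for video conferencing based on a novel high-frequency-conditional super-resolution pipeline. Gemino upsamples a very low-resolution version of each target frame while enhancing high-frequency details (e.g., skin texture, hair, etc.) based on information extracted from a single high-resolution reference image. We use a multi-scale architecture that runs different components of the model at different resolutions, allowing it to scale to resolutions comparable to 720p, and we personalize the model to learn specific details of each person, achieving much better fidelity at low bitrates. We implement Gemino atop aiortc, an open-source Python implementation of WebRTC, and show that it operates on 1024x1024 videos in real time on an A100 GPU, and achieves a lower bitrate than traditional video codecs for the same perceptual quality.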

Soil Erosion in the United States. Present and Future (2020-2050)

Shahab Aldin Shojaeezadeh , Malik Al-Wardy , Mohammad Reza Nikoo , Mehrdad Ghorbani Mooselu , Mohammad Reza Alizadeh , Jan Franklin Adamowski , Hamid Moradkhani , Nasrin Alamdari , Amir H. Gandomi

Category: Machine Learning

2022-07-14

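Soil erosion is a significant threat to the environment and to long-term land management around the world. Accelerated soil erosion caused by human activities inflicts extreme changes on terrestrial and aquatic ecosystems, which have not been fully surveyed or predicted for the present and probable future at field scale (30-m). Here, we estimate and predict soil erosion rates by water erosion (sheet and rill erosion) using three alternative (2.6, 4.5, and 8.5) Shared Socioeconomic Pathway and Representative Concentration Pathway (SSP-RCP) scenarios. Field Scale Soil Erosion Model (FSSLM) estimations rely on a high-resolution (30-m) G2 erosion model integrated with satellite- and imagery-based estimations of land use and land cover (LULC), gauge observations of long-term precipitation, and scenarios of the Coupled Model Intercomparison Project Phase 6 (CMIP6). The baseline model (2020) estimates soil erosion rates of 2.32 Mg ha$^{-1}$ yr$^{-1}$ with current agricultural conservation practices (CPs). Future scenarios with current CPs indicate an increase of between 8% and 21% under different combinations of SSP-RCP scenarios of climate and LULC changes. The soil erosion forecast for 2050 suggests that all the climate and LULC scenarios indicate either an increase in extreme events or a change in the spatial location of extremes, largely from the southern to the eastern and northeastern regions of the United States.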

COIN++: Neural Compression Across Modalities

Emilien Dupont , Hrushikesh Loya , Milad Alizadeh , Adam Goliński , Yee Whye Teh , Arnaud Doucet

Categories: Machine Learning | Computer Vision | Machine Learning (Statistics)

2022-01-30

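Neural compression algorithms are typically based on autoencoders that require specialized encoder and decoder architectures for different data modalities. In this paper, we propose COIN++, a neural compression framework that seamlessly handles a wide range of data modalities. Our approach is based on converting data to implicit neural representations, i.e., neural functions that map coordinates (such as pixel locations) to features (such as RGB values). Then, instead of storing the weights of the implicit neural representation directly, we store modulations applied to a meta-learned base network as a compressed code for the data. We further quantize and entropy code these modulations, leading to large compression gains while reducing encoding time by two orders of magnitude compared to baselines. We empirically demonstrate the effectiveness of our method by compressing various data modalities, from images and audio to medical and climate data.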
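
The "quantize and entropy code" step can be made concrete with a small sketch. The abstract does not say which quantizer or entropy coder is used, so the uniform quantizer and the empirical-entropy estimate below are assumptions for illustration, and all function names and parameters (`num_bits`, `lo`, `hi`) are hypothetical.

```python
import math
from collections import Counter


def quantize(modulations, num_bits, lo, hi):
    """Clip each value to [lo, hi] and map it to one of 2**num_bits integer codes."""
    step = (hi - lo) / (2 ** num_bits - 1)
    return [round((min(max(m, lo), hi) - lo) / step) for m in modulations]


def dequantize(codes, num_bits, lo, hi):
    """Map integer codes back to representative values in [lo, hi]."""
    step = (hi - lo) / (2 ** num_bits - 1)
    return [lo + c * step for c in codes]


def empirical_entropy_bits(codes):
    """Average bits per code that an ideal entropy coder would need for these codes."""
    counts = Counter(codes)
    n = len(codes)
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```

For values inside `[lo, hi]` the reconstruction error is at most half a quantization step, and the entropy estimate shows why entropy coding helps: when some codes are much more frequent than others, the average cost per code falls below `num_bits`.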

CausalSim: Toward a Causal Data-Driven Simulator for Network Protocols

Abdullah Alomar , Pouya Hamadanian , Arash Nasr-Esfahany , Anish Agarwal , Mohammad Alizadeh , Devavrat Shah

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2022-01-05

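Evaluating the real-world performance of network protocols is challenging. Randomized control trials (RCTs) are expensive and inaccessible to most researchers, while expert-designed simulators fail to capture complex behaviors in real networks. We present CausalSim, a data-driven simulator for network protocols that addresses this challenge. Learning network behavior from observational data is complicated by the bias introduced by the protocols used during data collection. CausalSim uses traces from an initial RCT under a set of protocols to learn a causal network model, effectively removing the biases present in the data. Using this model, CausalSim can then simulate any protocol over the same traces (i.e., for counterfactual predictions). Key to CausalSim is the novel use of adversarial neural network training that exploits distributional invariances that are present because the training data come from an RCT. Our extensive evaluation of CausalSim on both real and synthetic datasets and two use cases, including more than nine months of real data from the Puffer video streaming system, shows that it provides accurate counterfactual predictions, reducing prediction error by 44% and 53% on average compared to expert-designed and standard supervised learning baselines.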

Efficient Strong Scaling Through Burst Parallel Training

Seo Jin Park , Joshua Fried , Sunghyun Kim , Mohammad Alizadeh , Adam Belay

Categories: Computer Vision | Machine Learning

2021-12-19

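As emerging deep neural network (DNN) models continue to grow in size, using large GPU clusters to train DNNs is becoming an essential requirement for achieving acceptable training times. In this paper, we consider the case where future increases in cluster size will cause the global batch size that can be used to train models to reach a fundamental limit: beyond a certain point, larger global batch sizes cause sample efficiency to degrade, increasing overall time to accuracy. As a result, to achieve further improvements in training performance, we must instead consider "strong scaling" strategies that hold the global batch size constant and allocate smaller batches to each GPU. Unfortunately, this makes it significantly more difficult to use cluster resources efficiently. We present DeepPool, a system that addresses this efficiency challenge through two key ideas. First, burst parallelism allocates large numbers of GPUs to foreground jobs in bursts to exploit the unevenness in parallelism across layers. Second, GPU multiplexing prioritizes throughput for foreground training jobs, while packing in background training jobs to reclaim underutilized GPU resources, thereby improving cluster-wide utilization. Together, these two ideas enable DeepPool to deliver a 2.2 to 2.4x improvement in total cluster throughput over standard data parallelism with a single task when the cluster scale is large.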
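
The difference between the usual weak scaling and the "strong scaling" regime described above comes down to simple arithmetic, sketched here. The function names are hypothetical and the even-split requirement is a simplifying assumption; nothing below reflects how DeepPool itself schedules work.

```python
def weak_scaling_global_batch(per_gpu_batch, num_gpus):
    """Weak scaling: the per-GPU batch is fixed, so the global batch grows with the cluster."""
    return per_gpu_batch * num_gpus


def strong_scaling_per_gpu_batch(global_batch, num_gpus):
    """Strong scaling: the global batch is fixed, so each GPU receives a smaller share."""
    if global_batch % num_gpus != 0:
        raise ValueError("this sketch assumes the global batch divides evenly across GPUs")
    return global_batch // num_gpus
```

For example, holding a global batch of 256 fixed while growing from 8 to 64 GPUs shrinks the per-GPU batch from 32 to 4 samples, which illustrates why individual GPUs become underutilized in this regime.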
